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Abstract 

In this paper we present a new ultra efficient numerical method for solving ki- 
netic equations. In this preliminary work, we present the scheme in the case of the 
BGK relaxation operator. The scheme, being based on a splitting technique between 
transport and collision, can be easily extended to other collisional operators as the 
Boltzmann collision integral or to other kinetic equations such as the Vlasov equation. 
The key idea, on which the method relies, is to solve the collision part on a grid and 
then to solve exactly the transport linear part by following the characteristics back- 
ward in time. The main difference between the method proposed and semi-Lagrangian 
methods is that here we do not need to reconstruct the distribution function at each 
time step. This allows to tremendously reduce the computational cost of the method 
and it permits for the first time, to the author's knowledge, to compute solutions of 
full six dimensional kinetic equations on a single processor laptop machine. Numeri- 
cal examples, up to the full three dimensional case, are presented which validate the 
method and assess its efficiency in ID, 2D and 3D. 

Keywords: Kinetic equations, discrete velocity models, semi Lagrangian schemes, 
Boltzmann-BGK equation, 3D simulation. 



1 Introduction 

The kinetic equations provide a mesoscopic description of gases and more generally of 
particle systems. In many applications, the correct physical solution for a system far from 
thermodynamical equilibrium, such as rarefied gases or plasmas, requires the resolution of 
a kinetic equation [7]. However, the numerical simulation of these equations with deter- 
ministic techniques presents several drawbacks due to the large dimension of the problem. 
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The distribution function depends on seven independent variables: three coordinates in 
physical space, three coordinates in velocity space and the time. As a consequence, prob- 
abilistic techniques such as Direct Simulation Monte Carlo (DSMC) methods jH El EJ [29] 
are extensively used in real situations due to their large flexibility and low computational 
cost compared to finite volume, finite difference or spectral methods for kinetic equations 
[171 [181 [281 |3T1 [35] . On the other hand, DSMC solutions are affected by large fluctuations. 
Moreover, in non stationary situations it is impossible to use time averages to reduce these 
fluctuations and this leads to, either poorly accurate solutions, or again to computationally 
expensive simulations. 

For this reason, many different works have been dedicated to reduce some of the 
disadvantages of Monte Carlo methods. We quote [5] for an overview on efficient and 
low variance Monte Carlo methods. For applications of variance reduction techniques to 
kinetic equation let us remind to the works of Homolle and Hadjiconstantinou |21j and 
|22j . We mention also the work of Boyd and Burt |1] and of Pullin [36] who developed a 
low diffusion particle method for simulating compressible inviscid flows. We finally quote 
the works of Dimarco and Pareschi [141 [T5] and of Degond, Dimarco and Pareschi [12] who 
constructed efficient and low variance methods for kinetic equations in transitional and 
general regimes. 

In this work, we consider the development of a new deterministic method to solve 
kinetic equations. In particular, we focus on the development of efficient techniques for 
the discretization of the linear transport part of these equations. The proposed method 
is based on the so-called discrete velocity models (DVM) |28| and on the semi Lagrangian 
approach [8l [9] . The DVM models are obtained by discretizing the velocity space into a 
set of fixed discrete velocities [31 |28l 1311 [32] . As a result of this discretization, the orig- 
inal kinetic equation is then represented as a set of linear transport equations plus an 
interaction term which couples all the equations. In order to solve the resulting set of 
equations, the most common strategy consists in an operator splitting strategy [IQ]: The 
solution in one time step is obtained by the sequence of two stages. First one integrates 
the space homogeneous equations and then, in the second stage, the transport equation 
using the output of the previous step as initial condition. More sophisticated splitting 
techniques can be employed, which permits to obtain high order in time discretizations 
of the kinetic equations as for instance the Strang splitting method [37j. In any case, the 
resulting method is very simple and robust but the main drawback is again the excessive 
computational cost. It is a matter of fact that the numerical solution through such mi- 
croscopic models and deterministic schemes remains nowadays too expensive especially in 
multi-dimensions even with the use of super-computers. 

To overcome this problem, we propose to use a Lagrangian technique which exactly 
solves the transport stage on the entire domain and then to project the solution on a grid 
to compute the contribution of the collision operator. The resulting scheme shares many 
analogies with semi-Lagrangian methods [51 [HI [E] and with Monte Carlo schemes [24J , as 
we will explain, but on the contrary to them, the method is as fast as a particle method 
while the numerical solution remains fully deterministic, which means that there is no 
source of statistical error. Thanks to this approach we are able to compute the solution 
of the full six dimensional kinetic equation on a laptop. This is, up to our knowledge, the 
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first time that the fuh kinetic equation has been solved with a deterministic scheme on 
a single processor machine for acceptable mesh sizes and in a reasonable amount of time 
(around ten hours for 100^ space x 12^ velocity space mesh points). 

In this first work, we consider a simple collision operator, i.e. the BGK (Bhatnagar- 
Gross-Krook) relaxation operator [20j . The extension of the method to other operators 
like the Boltzmann one [BE] or to other kinetic equations like the Vlasov equation [21 [18] 
will be considered in future works. At the present moment, the method is designed to 
work on uniform grids, although extensions to other meshes are possible and will be also 
considered in the next future. 

The article is organized as follows. In section [2| we introduce the Boltzmann-BGK 
equations and their properties. In section|3| we present the discrete velocity model (DVM). 
Then in section [4] we present the numerical scheme. Section [5] is devoted to the illustration 
of the analogies between such fast kinetic scheme (FKS) and particle methods. Several 
test problems up to three dimensional test cases which demonstrate the capabilities and 
the strong efficiency of the method are presented and discussed in section [6| Some final 
considerations and future developments are finally drawn in the last section. 



2 Boltzmann-BGK Equation 

We consider the following kinetic equation as a prototype model for developing our method: 

dtf + v-V,f = ^iMf-f), (1) 

with the initial condition 

f{x,v,t = 0) = fo{x,v). (2) 

This is the Boltzmann-BGK equation where / = f{x,v,t) is a non negative function 
describing the time evolution of the distribution of particles which move with velocity v G 
Mf^ in the position x G $7 C M'^ at time t > 0. For simplicity we consider the same dimension 
in space and in velocity space d, however it is possible to consider different dimensions 
in order to obtain different simplified models. In the BGK equation the collisions are 
modeled by a relaxation towards the local thermodynamical equilibrium defined by the 
Maxwellian distribution function Mf. The local Maxwellian function is defined by 

M, = MAP. u, T]iv) = exp j , (3) 

where p G M* and u G M'^ are the density and mean velocity while 9 = RT with T the 
temperature of the gas and R the gas constant. The macroscopic values p,u and T are 
related to / by: 

P = I fdv, u = I vfdv, ^ ~ ~j I 1^ ~ u\'^fdv. (4) 



The energy E is defined by 



E = l [ \v\'fdv = Ip\u\' + ^pd, (5) 
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while the kinetic entropy of / by 



Hi f)= / f log fdv. (6) 

The parameter r > in ([T]) is the relaxation time. In this paper, r is fixed at the beginning 
of each numerical test. Considering relaxation frequencies as functions of the macroscopic 
quantities does not change the numerical scheme we will propose and its behaviors. We 
refer to section [6] for the numerical values chosen. 

If we consider the BGK equation multiplied by 1, v, ^jf^l (the so-called collision 
invariants), and then integrated with respect to v, we obtain the following balance laws: 

^ + ^x- (pu) = 0, 

^ + V^-{pu^u + P) = 0, (7) 
dE 

— + V:,. • {Eu + Pu + q)=0, 

which express the conservation of mass, momentum and total energy, in which P = J^div — 
u) {v — u)fdv is the pressure tensor while q = f^d ^{v — u)\v — up dv is the heat flux. 
Furthermore the following inequality expresses the dissipation of entropy: 

dt( [ flogfdv]+V^-([ vflogfdv)<0. (8) 

System ([T]) is not closed, since it involves other moments of the distribution function than 
just p, pu and E. Let us describe one way to close the system. 

The Maxwellian Mj can be characterized as the unique solution of the following entropy 
minimization problem 



H{Mf) = mm if), f > s.t. j mf dv = 



(9) 



where m and U are the vectors of the collision invariants and of the first three moments 
of / respectively: 

m{v) = (^l,v, ^\v\^^ , U = {p, pu, E). (10) 

This is the well-known local Gibbs principle, and it expresses that the local thermodynam- 
ical equilibrium state minimizes the entropy, in the mathematical sense, of all the possible 
states subject to the constraint that moments U are prescribed. 

Formally, when the number of collision goes to infinity, which means r — )• 0, the 
function / converges towards the Maxwellian distribution. In this limit, it is possible to 
compute the moments P and g of / in terms of p, pu and E. In this way, one can close 
the system of balance laws ([T]) and get the so-called Euler system of compressible gas 



4 



dynamics equations 



1^ + V, • {pu) = 0, 
dpu „ 

-^ + V^-{pu®u + pi) = 0, 

dE (1^) 
— +V,-((^+p)n) = 0, 

p = p9, E ='^pe + ]^p\u\'^. 



3 The Discrete Velocity Model (DVM) 

The principle of Discrete Velocity Model (DVM) is to set a grid in the velocity space and 
to transform the kinetic equation in a set of linear hyperbolic equations with source terms. 
We refer to the work of Mieussens [28j for the description of this model and we remind to 
it for the details. 

Let /C be a set of N multi-indices of N'^, defined hy K = {k = {k^^)f^^, k^^ < K^^], 
where {X*^*^} are some given bounds. We introduce a Cartesian grid V of by 

V = {vk = kAv + a, keIC}, (12) 

where a is an arbitrary vector of M"^ and Av is a scalar which represents the grid step in 
the velocity space. We denote the discrete collision invariants on V by = (1, w^, ^|ffcp). 

Now, in this setting, the continuous distribution function / is replaced by a A^— vector 
fic{x,t), where each component is assumed to be an approximation of the distribution 
function / at location Vk- 

fjc{x,t) = {fk{x,t))k, fkix,t) ^ f{x,Vk,t). (13) 

The fluid quantities are then obtained from fk thanks to discrete summations on V: 

U{x,t) = Y,mkfk{x,t)Av. (14) 

k 

The discrete velocity BGK model consists of a set of N evolution equations for of the 
form 

dtfk + Vk ■ V,/fc = -i£k[U] - fk), (15) 

T 

where if/c[f7] is a suitable approximation of Mf defined next. Two strongly connected and 
important questions arise when dealing with discrete velocity models. The first one is 
about the truncation and boundedness of the velocity space. The second one concerns the 
conservation of macroscopic quantities. 
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Truncation and boundedness of the velocity space. In DVM methods one needs to 
truncate the velocity space and to fix some bounds. This gives the number N of evolution 



equations (15). Of course, the number N is chosen as a compromise between the desired 
precision in the discretization of the velocity space and the computational cost, while the 
bounds are chosen to give a correct representation of the flow. Observe in fact that, the 
macroscopic velocity and temperature are bounded above by velocity bounds. This implies 
that the discrete velocity set must be large enough to take into account large variations 
of the macroscopic quantities which may appear as a result of the time evolution of the 
equations. Moreover, as a consequence of the velocity discretization, we have that the 
temperature is bounded from below. We summarize the above remarks by the following 
statement. Let / be a non negative distribution function, then the macroscopic velocity 
and temperature associated to / in V by 

u=^{vf),c, r=-^(|t;-n|2/k, (16) 
p dnp 

where {.)jc denotes the summation over the set of multi-indices /C, satisfy the bounds |28| 

mini;^,*^ < n^*) < maxu^.*'', Vi=l,...,d (17) 

— — minlf — uP< T < — - max — nP. (18) 
dR K. dR K 



Conservation of macroscopic quantities. Exact conservation of macroscopic quan- 
tities is impossible, because in general the support of the distribution function is non 
compact. Thus, in order to conserve macroscopic variables, different strategies can be 
adopted, two possibilities are described in [121 EH]. Moreover, the approximation of the 
equilibrium distribution Mf with <?fc[f^] must be carefully chosen in order to satisfy the 
conservation of mass, momentum and energy. In the following section we will discuss our 
choices in details. Such choices prevent the lack of conservation of physical quantities. 

Remark 1 Once DVM model is defined as above, the common choice which permits to 
solve the kinetic equation is to discretize the N evolution equations with the preferred finite 
volume or finite difference method 1281 I A' 5]/ . Alternatively, one can reconstruct the 

distribution function in space and then follows the characteristics backward in time to 
obtain the solution of the linear transport equation |3 O \17[ Our choice, described 

in the next section, which enables to drastically decrease the computational cost, consists 
of an exact solution of the linear transport equation avoiding the reconstruction of the 
distribution function. 

4 Fast kinetic schemes (FKS) 

The main features of the method proposed in this work can be summarized as follows: 
• The BGK equation is discretized in velocity space by using the DVM method. 
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A time splitting procedm'e is employed between the transport and the relaxation 
operators for each of the resulting evolution equations (15). First- and second- 
order Strang time splitting [37] are considered. 



• The transport part is exactly solved, which means without using a spatial mesh. 
The initial data of this step is given by the solution of the relaxation operator. 

• The relaxation part is solved on the grid. The initial data for this step is given by 
the value of the distribution function in the center of the cells after the transport 
step. 

Before describing the scheme, we explain how we overcome the drawback of the lack of 
conservation of macroscopic quantities in DVM methods. 

4.1 Conservative methods 

We introduce the conservative method for the initial data and then we extend it to the 
scheme. The initialisation is done in two steps. First we fix 

fk{x,t = 0) = f{x,vk,t = 0), k = l,...,N. (19) 

Observe that, in order to do this operation we do not need to discretize the physical 
space, in others words, if the initial data is known continuously, this information can be 
kept. However, for simplicity, we already at this stage introduce a Cartesian uniform grid 
in the physical space. This is defined by the set J7 of M multi- indices of N'^, which is 
J = {j = (j»)ti, < j(0}^ where {J»} are some given bounds which represent the 
boundary points in the physical space. Next, the grid X of M'^ is given by 

X = {x,=jAx + b,jeJ}, (20) 

where d represents at the same time the dimension of the physical space and the dimension 
of the velocity space which are taken equal for simplicity, even if this is not necessary for 
the setting of the numerical method. Finally, 6 is a vector of M'^ which determines the 
form of the domain and Ax is a scalar which represents the grid step in the physical space. 
We consider a third discretization which is the time discretization t" = nAt. We will later 
in the paper introduce the time step limitations. 

We denote with /"^ the approximation /"^ ~ f{xj,Vk,tn) and with /j^^ the pointwise 

distribution value /"^i. = f{xj,Vk,tn) which are different, for conservation reasons, as 
explained next. In this notation, the discrete moments of the distribution / are 

U^ = {mkf;i,k^v)K- (21) 

The corresponding discrete equilibrium is denoted £k [f/"] , or equivalently by <f [U] , which 
is an approximation of Mj[C/"] and it will be also defined later. When the distribution 
function is truncated in velocity space, conservation of the macroscopic quantities is no 
longer possible. Thus, in order to restore the correct conserved variables we make use 
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of a simple constrained Lagrange multiplier method |19] . where the constraints are mass, 
momentum and energy of the solution. Let us recall the technique from [19] : Let be the 
total number of discretization points of the velocity space of the distribution function. We 
consider one space cell, the same renormalization of / should be considered for all spatial 
cells. Let 

7= (LL-.-JnY (22) 



be the pointwise distribution vector at t = and 

f = {hj2,...jNf (23) 

be the unknown corrected distribution vector which fulfills the conservation of moments. 
Let 

/ {Avr \ 

Cid+2)>cN = VkiAvY (24) 
V \vk\H^vy J 

and C/(d+2)xi = (p P'u E)'^ be the vector of conserved quantities. Conservation can be 
imposed using a constrained optimization formulation: 

Given Jg M^, C G M('^+2)x^, and U G m(^+2)x1, 

find / G such that (25) 
11/ — /II2 is minimized subject to the constrain Cf = U. 

To solve this constrain minimization problem, one possibility is to employ the La- 
grange multiplier method. Let A G M'^^^ be the Lagrange multiplier vector. Then the 
corresponding scalar objective function to be optimized is given by 

N ^ 

L{f, X) = Y^ \fk - /feP + X^iCf - U). (26) 
fc=i 

The above equation can be solved explicitly. In fact, taking the derivative of L(/, A) with 
respect to fk, for all A; = 1, A'" and Aj, for all i = 1, d + 2, that is to say the gradient 
of L, we obtain 

Or -1 

— = 0, = 1, iV ^ / = 7+ -C^A, (27) 
ojk 2 

and 

dL 

— = 0,i = l,...,d + 2^Cf = U. (28) 
Now, solving for A we get 

CC^X = 2{U -Cf), (29) 

and observing that the matrix CC^ is symmetric and positive definite, since C is the 
integration matrix, one deduces that the inverse of CC^ exists. In particular the value of 
A is uniquely determined by 

X = 2{CC^)-\U -Cf). (30) 
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Back substituting A into (27) provides 



f = f + C^(^CC^)-\U-Cf). (31) 

Observe that, following the same principle, we can impose the conservation of other macro- 
scopic quantities, in addition to mass, momentum and energy. A key point is that, 
in practice, we need to solve the above minimization problem only for the initial data 
/(xj, t'fc, t = 0) because once the conservation is guaranteed for f = 0, this is also guaran- 
teed for the entire computation because the exact solution is used for solving the transport 
step. The only possible source of loss of conservation for the entire scheme is the relax- 
ation step. This means that, for this step, we will need to impose conservation of the 
macroscopic quantities but only for the equilibrium distribution. 

The discretization of the Maxwellian distribution Mf{x,v,t), should satisfy the same 
properties of conservation of the distribution /, i.e. = {mkfj^j^A.v)ic = (mfciSfc[C/"] Af 
To this aim, observe that the natural approximation 

£k[U^] = Mf{xj,Vk,tn), k€)C, n>0, j €j (32) 

cannot satisfy these requirements, due to the truncation of the velocity space and to 
the piecewise constant approximation of the distribution function. Thus, the calculation 
carried out above for the definition of the initial distribution /, can be also performed for 
the equilibrium distribution Mf. This should be done each time we invoke the equilibrium 
distribution during the computation. The function £[U] is therefore given by the solution 



of the same minimization problem defined in (25), and its explicit value is given mimicking 



(31j) by 

£[U] = Mf[U] + C^{CC^)-\U - CMf[U]), (33) 

where Mf[U] represents the pointwise values of the Maxwellian distribution Mf[U] = 
Mf{xj,Vk,tn)- Notice that the computation of the new distributions / and £ only involves 
a matrix- vector multiplication. In fact, matrix C only depends on the parameter of the 
discretization and thus it is constant in time. In other words matrices C and C'^ {CC'^)~^ 
can be precomputed and stored in memory during the initialisation step. They are used 



during the simulation when the solution of system (25) is invoked. 

Another possibility to approximate the Maxwellian distribution Mj is proposed in |28] . 
In that work, the authors define ffc[f7] as the solution of a discrete entropy minimization 
problem 

H^{£[U]) =mm{H,c(.9),9>0£ such that {mg),c = U} . (34) 

This discretization (existence, uniqueness, convergence) has been mathematically studied 
in |28| . However, one drawback of this method, is the need for solving a non linear system 
of equations in each spatial cell for each time step. As we seek for efficiency, we only 



consider the first minimization strategy (25) to approximate the equilibrium distribution 
Mf. 
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4.2 Conservative and fast kinetic schemes FKS 



We can now present the full scheme. Let us start with the first-order splitting scheme and 
then, define the second-order in time method based on the Strang splitting strategy |37] . 
Let fjf^ be the initial data defined as a piecewise constant function in space and in 



velocity space, solution of equation (31) with /j'^ = f{xj,Vk,t = 0). We recall that the 
choice of a piecewise constant function in space is not mandatory for the method. Let 
also £jj^[U] be the initial equilibrium distribution solution of equation (33) with Mj/, = 
Mf{xj,Vk,t = 0). We start describing the first time step of the method [r^t^] starting at 
= 0, we further generalize the method to the generic time step starting from 



First time step Let us describe the transport and relaxation stages. 

Transport stage. We need to solve N linear transport equations of the form: 

dtfk + Vk-V.fk = 0, k = l,...,N, (35) 

where the initial data for each of the N equations is a piecewise constant function in the 
three dimensional space defined as 

7fc(x,t° = 0) = /0fc VxG [x,_i/2,x,+i/2], k = l,...,N. (36) 

The exact solution of the equations at time t^ = t^ + At = At is given by 

fkix,t') = fl{x) = f{x-VkAt), k = l,...,N. (37) 

Observe that, here, we do not need to reconstruct our function as for instance in the semi- 
Lagrangian schemes |17l [T8] , the shape of the function in space is in fact known and fixed 
at the beginning of the computation. Once the solution of the transport step is known, to 
complete one step in time, we need to compute the solution of the relaxation step. As in 
finite volume or finite difference methods, we solve the relaxation step only on the grid, 
thus only the value of the distribution function / in the centers of the cells are computed. 
Prom the exact solution of the function we can immediately recover these values at the 
cost of one simple vector multiplication. On the other hand, one notices that for classical 
finite difference or finite volume methods nested loops for each dimension in space and in 
velocity space are mandatory to compute the solution of the transport part. This makes 
the computational cost of these methods extremely demanding in the multidimensional 
cases. On the contrary, the computational cost of the method we propose is only of 
the order of the number of points in which the velocity space is discretized {0{N)). In 
particular, for uniform meshes, we only need to compute the new value of in the center 
of one single cell, to know the solution in the center of all others cells. 

Relaxation stage. For this step we need to locally solve on the grid, i.e. in the center 
of each spatial cell, an ordinary differential equation. Thus, we have to solve: 

dtfj,k = -i£j,k[U]- hk), k = l,...,N, j = l,...,M, (38) 
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where the initial data is the result of the transport step 

Jk{xj,t^) = rkixj), k = l,...,N, j = l,...,M. (39) 

Any discretization method in time for this term can be chosen, as for instance the preferred 
Runge Kutta method. However, being the above equation a first order linear ordinary dif- 
ferential equation, we choose to compute the exact solution. The last ingredient needed 
to perform the computation, is the value of the equilibrium distribution £ at the center of 
the cell after the transport stage. To this aim, observe that, the Maxwellian distribution 
does not change during the relaxation step, which means that during this step the macro- 
scopic quantities remain constants. This implies that only the transport stage possibly 
modifies the equilibrium distribution. In order to compute the Maxwellian, the macro- 
scopic quantities in the center of the cells, i.e. the density, the mean velocity and the 
temperature, are given by summing the local value of the discrete distribution / over the 
velocity set {rukf* f,Av)jc = Uj, j = 1, . . . ,M, where = f^ixj). Finally, the discrete 



equilibrium distribution at time = + At is the solution of equation ( 33 ) with moments 



uj, j = 1, . . . , M. We can now compute the solution of the relaxation stage by 



fl, = exp(-At/e)/* , + (1 - exp(-At/£))4,[[/]. (40) 

Observe that the above equation furnishes only the new value of the distribution at time 
= + At = At in the center of each spatial cell for each velocity Vk- However, what we 
need, in order to continue the computation, is the value of the distribution / in all points of 
the space. To overcome this problem, in classical discrete velocity methods several authors 
\28\ [3T] consider the distribution function constant in the cell as well as the Maxwellian 
distribution. The result is that they need to solve only an ordinary differential equation in 
the center of the cell taking the average value of the macroscopic quantities inside one cell. 
Here, we make a different approximation. We consider that the equilibrium distribution 
Mf has the same form as the distribution / in space. In other words £k is a piecewise 
constant function in space for each velocity Vk- The values of this piecewise constant 
function are the values computed in the center of the spatial cells, i.e. one defines 

£k{x, t^) = 4^, Vx s.t. 7fc(x, t^) = 7fc(x,, t^), j = l,...,M. (41) 

This further implies that the relaxation term writes in term of spacial continuous function 
7fc(x,t^) as 

Jk{^,At) = exp{-At/e)Jf,{x,t^) + {l-exp{-At/e))£k{x,t^)[U]. (42) 

For each velocity Vk this choice permits to keep the form of the distribution fk constant 
in space throughout the computation, and, as a consequence it drastically reduces the 
computational cost. This ends the first time step. 

We focus now on the time marching procedure for the first- and second-order splitting 
schemes which will allow to solve the Bolzmann-BGK equation. 
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Generic time step [t"; t"""^^]. We present a first-order and second-order Strang splitting 
teclinique [37J. 

First-order splitting: Given tlie value of the distribution function /^(x), for all k = 
1, . . . ,N, and all x gM'^ at time t", the value of the distribution at time i""*"^, (x), is 
given by 

fl{x) = f]:{x-VkAt), k = l,...,N (43) 



/r'(^) = exp(-At/e)/;(x) + (1 - exp(-At/e))£:r\x)[C/], k = l,...,N, (44) 
where Sf^ {x)[U] is a piecewise constant function, computed considering the solution of 



the minimization problem ( 33 ) relative to the moments value in the center of each spatial 



cell after the transport stage: Uj , j = 1, . . . , M. These moments are given by computing 
{mkf* f,Av)!c where is the value that the distribution function takes after the transport 
stage in the center of each spatial cell. 

Second-order splitting: Given the value of the distribution function (x), k = 1, . . . ,N, 
a; G M'^ at time t"', the scheme reads 

Tk{x) = fj:{x-VkAt/2), k = l,...,N (45) 



/r(x) = exp(-At/e)/,(x) + (l-exp(-At/e))£:*(x)[t/], k = l,...,N, (46) 
where £l.{x)[U] is a piecewise constant function, computed considering the solution of the 



minimization problem ( 33 ) relative to the moments values in the center of each spatial cell 
after the transport stage of size At/ 2. We call these moments Uj, j = 1, . . . ,M. They are 
given by the discrete summation {rukf* j^Av)ic where is the value that the distribution 
function takes after the transport stage in the center of each spatial cell. The last step 
consists of a second transport stage of half time step 

7T\x) = rk*i^-VkAt/2), k = l,...,N, (47) 

which ends the second-order splitting scheme. 



Remark 2 

• As already mentioned the choices of uniform meshes and piecewise constant func- 
tions in space are not necessary for the construction of the method. These choices 
have been made because we wanted to analyze the method in its simplest form. We 
postpone to future works the study of non-uniform meshes and different shapes of the 
distribution function f in space. However, a key point is that, even if the method in 
its general form is already much more faster than finite volume, finite difference or 
semi Lagrangian methods for kinetic equations, it can be made extremely fast in the 
case of uniform meshes as we will explain in the next paragraph. 
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For finite volume or finite difference methods applied to discrete velocity models 
of kinetic equations, the second-order time splitting implies the computation of the 
transport stage in two steps, from f" to t^+^l"^ and from to t"'"'"^. Conversely 

the same operation can he done with the relaxation step to get second order accuracy. 

In our method, extending the scheme from first- to second-order time splitting is 
almost as expensive as the first-order. In fact, except for the first time step in which 
we need to compute two times the transport operator with At/2, starting from the 
second time step we have to solve a sequence of two At/ 2 transport stages. However, 
being the transport computed exactly, solving the linear transport equations two times 
with At/ 2 or only one time with the entire At provides the same solution. This 
means that, in order to obtain second-order accuracy it is sufficient to solve the first 
time step with At/ 2 and then proceed as for the first-order method to obtain global 
second-order accuracy in time. 

However, any time splitting method does degenerate to first-order accuracy in the 
fluid limit, that is to say, when r — t- 0. 

Due to the fact that the relaxation stage preserves the macroscopic quantities, the 
scheme is globally conservative. In fact, at each time step, the change of density, 
momentum and energy is only due to the transport step. This latter, being exact, 
does preserve the macroscopic quantities as well as the distribution function. 

For the same reason, the scheme is also unconditionally positive. In others words, 

we observe that /^(x) > 0, for all n > 0, and k = 1, . . . , M if the initial datum is 
positive f1{x) > for all k = 1, . . . , M . In fact, the transport maintains the shape 
of f unchanged in space while the relaxation towards the Maxwellian distribution is 
a convex combination of Mf and f{x — VkAt) both being positive. 

We expect the scheme to perform very well in collisionless or almost collisionless 
regimes. In these cases in fact the relaxation stage is neglectible and only the exact 
transport does play a role. When moving from rarefied to dense regimes the projection 
over the equilibrium distribution becomes more important. Thus, the accuracy of the 
scheme is expected to diminish in fluid regimes, because the projection method is only 
first-order accurate. One possibility, for such regimes is to increase the order of the 
projection method towards the equilibrium. This possibility will also be analyzed in 
future works. 

The time step At is chosen as the classical CFL condition 

Observe that this choice is not mandatory, in fact the scheme is always stable for 

every choice of the time step, but being based on a time splitting technique the error 
is of the order of At or (At)^. This suggests to take the usual CFL condition in 
order to maintain the error small enough. 
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5 Analogies with particles methods 



In this section we first introduce a Monte Carlo particle method which permits to solve 
the Boltzmann-BGK equation. Next, we introduce its deterministic counterpart, i.e a 
deterministic particle method. Finally, we show that a slightly modified version of this 
latter method, in which the positions of the particles, instead of being randomly chosen, arc 
taken initially at the same position in space for all the cells, is equivalent to a FKS method 
where some specific choices of the discretization parameters are done. This analogy permits 
to derive a very convenient form of the algorithm which for this choice of the discretization 
parameters. 

The starting point of Monte Carlo methods is again given by a time splitting between 
free transport 

dtf + v-V^f = 0, (49) 

and collision, which in the case of the BGK operator is substituted by a relaxation towards 
the equilibrium 

dtf=^{f-Mf[U]). (50) 

In Monte Carlo simulations the distribution function / is discretized by a finite set of 
particles 

JV 

f = Y^mid{x-Xi{t))5{v-Vi{t)), (51) 

1=1 

where Xi{t) represents the particle position, Vi{t) the particle velocity and mi the particle 
mass which is usually taken constant. During the transport stage the particles move to 
their next positions according to 

Xiit + At) = Xi{t) + Vi{t)At, (52) 

where At is such that an appropriate CFL condition holds. This condition normally 
implies that one particle does not cross more than one cell in one time step. 

The collision step acts only locally, changes the velocity distribution but preserves the 
macroscopic quantities. In this CclS6j clS already explained, the space homogeneous problem 
admits the following exact solution at time t + At 



fit + At) = e-^'/^m + (1 - e-'''/-)Mf[U]{t). 



(53) 



Thus, in a Monte Carlo method, the relaxation step consists in replacing randomly selected 
particles with Maxwellian particles with probability (1 — e"^*/"^). This means 

„ r+ ^»(*)' witli probability e"^*/^ 

Vi[t + - I Mf[U]{v), with probability 1 - e"^*/^ ' ' 

where Mf[U]{v) in the above expression represents a particle sampled from the Maxwellian 
distribution with moments U. Observe that, second-order splitting can be used as well 
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in the Monte Carlo methods. As in the case of the FKS, because the transport step is 
resolved exactly, the change with respect to the first-order method is only the first time 
step which has to be computed with a time step of Ai/2. This will assure second-order 
accuracy in time except in the limit r — ?■ in which the method degenerates again to 
first-order accuracy. 

We introduce now a modified particle method which shares many analogies with our 
method. Instead of the continuous kinetic equation, this modified particle approach solves 
the discrete velocity approximation of the kinetic equation. In this method, the distribu- 
tion function is again represented by a piecewise constant function, defined on a compact 
support in the velocity space. The distribution function is approximated by a finite set of 
particles in each spatial cell as in the previous Monte Carlo method. The main difference 
with respect to the other particle method is that now the particles can attain only a dis- 
crete set of velocities and that the mass of each particle is no more a constant, instead it 
changes in time during the time evolution of the kinetic equation. These types of methods 
are known in literature as weighted particles methods |1H [26l [27] . Therefore we consider 

N 

f = ^mi{t)6{x - Xi{t)), 6{v-Vi{t)), Vi{t) = Vk, k e )C, (55) 
1=1 

where /C is the same set of multi-indices than the DVM discretization (this means that the 
number of particle is fixed equal to the number of points in which the velocity space is 
discretized). The BGK equation is again split into two stages: a transport and a relaxation 
stage. The transport part, as before, corresponds to the motion of the particles in space 



caused by their velocities (52). The main difference is in the solution of the relaxation 



part (50). In order to solve this equation from a particle point of view, we change the 

(56) 



mass of each particle using the exact solution of the relaxation equation, i.e. 



fit + At) = e-^*/V(t) + (1 - e-^*/^)M;[C/](t). 
this corresponds to 

xni{t + At) = e-'^'/^f{vi) + {l-e-^'/^)£{vi)[U], i = l,...,N. (57) 
Again in practice to avoid the loss of conservation of macroscopic quantities, once the 



conserved quantities are computed in one cell, we solve the minimization problem (33) 
to get the function £[U]. Thus, the above procedure requires the knowledge of Uj, j = 
1, . . . , M, which can only be estimated from the sample positions. The simplest method, 
which produces a piecewise constant reconstruction, is based on evaluating the histogram 
of the samples on the grid, considering all the samples inside one cell be of the same 
importance irrespectively of their positions. In practice, the density pj, j = 1,. . . ,M is 
given by the number of samples Nj. belonging to the cell Ij 
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while the mean velocity in each spatial direction and the energy are given by 



1 \ ^ 1 V ^ r, 

— > xriiVi, Ej = — — ^ I"' I 

* -J - CZT 



m-iVi, Ej = — — ^ tTiilfjp. (59) 



The method described above deserves some remarks. First, note that as r — )• the method 
becomes a particle scheme for the limiting fluid dynamic equations. This limit method is 
the analogous of a kinetic particle method for the compressible Euler equations. Second, 
the simple splitting method described is first-order in time. Second order Strang splitting 
can be implemented similarly to the case of the FKS scheme described in the previous 
section. 

Now, we dispose of all the elements which permit to highlight the similarities with 



the FKS scheme. Observe that the relaxation step (57) is no more solved statistically 



as for the original Monte Carlo method (54). Thus, the scheme described is in fact a 



deterministic particle scheme, in which, however, the particle positions are still randomly 
initialized. Now, if we consider the piecewise reconstruction of the macroscopic quantities 



introduced before (58 59), we take one single particle for each velocity Vk,k ^ )C and we 
fix all particles positions at the beginning of the computation at the center of each cell we 
obtain the FKS described in the previous section. In fact, first the number of particles 
in each spatial cell remains constant in time and equal to the number of mesh point in 
velocity space N. This is because for each particle that goes out of one cell, there exists 
another particle with the same velocity which enters in the cell from another location. 
This is due to the fact that particles have initially the same position, they never change 
velocity and the mesh is uniform. Thus, during the time evolution the only quantity that 
is modified is the mass of the particle. This mass changes according to the solution of the 



relaxation equation (57). This is exactly what happens in the FKS method in equation 
(44). Finally, the transport is solved exactly for the particle scheme as well as for the 
FKS method. However, the weighted particle scheme, can be viewed as a particular case 
of the FKS method. In fact, to regain the weighted particle method, we have to fix the 
position of the particles, take only a single particle for a given velocity v^, the mesh must 
be uniform and the shape of the distribution function in space must be piecewise constant 
for the FKS method. This analogy between the two schemes permits, from one side, to 
derive a very efficient algorithm for the FKS method. From the other side, it opens the 
way to in deep discussions from the theoretical point of view on the relation between the 
two methods , like the different convergence properties of the two approaches. We remind 
to a future work for an analysis of the convergence of the FKS method. 



6 Numerical tests 
6.1 General setting 

In this section, we present several numerical tests to illustrate the main features of the 
method. First the performance of the scheme is tested in the one dimensional case for 
solving the Sod problem. In this case, we do comparisons of our method with different 
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finite difference methods whicli can solve tlie same problem. In the one dimensional case, 
the computational speedup is not very relevant being all classical methods sufficiently fast. 
However, the FKS method is still faster than the other methods. In a second series of 
tests we solve a two dimensional- two dimensional kinetic equation. Finally we solve a full 
three-three dimensional problem. In this situation, it is a matter of fact that computing 
the solution of a kinetic equation with finite difference, finite volume or semi-Lagrangian 
methods is unreasonable. We will show results from our method running on a mono- 
processor laptop machine. 

6.2 ID Sod shock tube problem 

We consider the ID/ID Sod test with 300 mesh points in physical and 100 points in 
velocity spaces. The boundaries in velocity space are set to —15 and 15. The left and 
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right states are given by a density pi = 1, mean velocity = and temperature Tl = 5 
if < X < 0.5, while pR = 0.125, ur = 0, Tr = A if 0.5 < x < 1. The gas is in 
thermodynamical equilibrium. We repeat the same test with 4 different values of the 
Knudsen number, ranging from r = 10~^ to r = 10~^. We plot the results for the 
final time tgnai = 0.05 for the density (Figure [T]), the mean velocity (Figure [2]) and the 
temperature (figure [3]) . In each figure we compare the FKS method with a third order 
WENO method, a second-order MUSCL method and a first-order upwind method [23]. 
These numerical methods used as reference, employ the same discretization parameters, 
except for the time step which for stability reason is chosen equal to At/2 for the WENO 
and second-order MUSCL schemes, where At is the time step of the fast DVM method 
given by ( 48 ) . 

From Figures ([l]) to ([3]) we can observe that our method gives very similar results to 
the two high order schemes for r = 10~^, r = 10~^ and r = 10"'^ while for r = 10~^, 
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Figure 3: ID Sod test: solution at ifinai = 0.05 for the temperature, with r = 10 ^ (top 
left), r = 10~^ (top right), r = 10~^ (bottom left) and r = 10"'^ (bottom right). 



the scheme is more diffusive than the second and third order scheme but it still performs 
better than the first order method. The behaviors of the method for different regimes are 
due to the fact that for collisionless regimes the FKS gives almost the exact solution, this 
means that it is more precise than the third and second-order methods. When the gas 
becomes denser the projection towards the equilibrium, which is only first-order (second 
step of the method (44)), does reduce the accuracy of the method. Notice that high 
order reconstruction of the equilibrium distribution could also be considered to increase 
the global accuracy in such case. However, a key point of the FKS is its low CPU time 
consumption in comparison to other existing methods. In the case r = 10~^ for which the 
scheme exhibits diffusive behaviors, a comparison between the third order WENO method 
and our FKS method is carried out for a fixed CPU time. In other words, we consider 
for a given total computational time, which method gives better results. Thus, we solve 
the problem with 200 points in space and 100 in velocity space for the WENO method 
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and we consider an FKS solver which employs 100 points in velocity space. In order to 
have the same computational time for the two methods, we can afford 1000 points for 
the FKS. The two results are compared in Figure |4j We observe that, in this situation, 
the FKS method gives more accurate solutions, in particular for the shock wave (see the 
zooms in the figures). Finally, observe that the gain in term of computational time is not 
so relevant for the one dimensional case, while it becomes very important for the two and 
the three dimensional case. In the later case, the difference is about being able to do or 
not to do the computation in a reasonable amount of time on a single processor machine. 





Figure 4: ID Sod test: solution at tgnai = 0.05 for the density, the mean velocity and the 
temperature with r = 10^^. Comparison of solutions for the same computational time and 
different meshes. WENO 200 points (dashed line) and Fast DVM 1000 points (straight 
line) . 



6.3 2D Sod shock tube problem 

We consider now the 2D/2D Sod test on a square [0, 2] x [0, 1]. The velocity space is also 
a square with bounds —15 and 15, i.e. [—15, 15]^, discretized with Ny = 20 points in each 
direction which gives 20^ points. We repeat the same test using different x Ny meshes 
ranging from Nx = Ny = 25 to = Ny = 200. The domain is divided into two parts, a 
disk centered at point (1, 1) of radius Rd = 0.2 is filled with a gas with density pL = 1, 
mean velocity ul = and temperature = 5, whereas the gas in the rest of the domain 
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x-velocity 
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Figure 5: 2D Sod test: solution at tunai = 0.07 for the density (top left), the velocity in the 
x-direction (top right), the velocity in the y-direction (bottom left) and the temperature 
(bottom right). 



is initiated with pji = 0.125, un = 0, Tr = 4. The final time is fgnai = 0.07. The gas is in 
thermodynamical equilibrium during all the computation which means that we fix r = 0. 
In practice, we are using the kinetic scheme to compute the solution of the compressible 
Euler equation. We recall that, as seen in the previous section, this is the case in which 
the FKS scheme gives the worse results, this is due to the first order accurate projection 
towards the local Maxwellian distribution. However, this choice permits to compare our 
results with a numerical method for the compressible Euler equations, being as already 
stated, computationally very demanding to perform simulations of kinetic equations in the 
two dimensional case and considerably more demanding in the three dimensional case. 

In Figure [5] we show the results for respectively the density, the mean velocity in the 
x-direction and in the y-direction and the temperature using a 200 x 200 mesh. In Figure [6] 
we report the profile for x = 1 of the same macroscopic quantities comparing the results 
to a first order and to a second order MUSCL scheme for the compressible Euler equations 
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Figure 6: 2D Sod test: solution (continuous line) at tgnai = 0.07 and x = 1 for the density 
(top left), the velocity in the x-direction (top right), the velocity in the y-direction (bottom 
left) and the temperature (bottom right). Comparisons with first order and second order 
MUSCL methods (dotted hues) 



|23| . We clearly see that, as in the ID case, the accuracy of the FKS method lies between 
the first and the second order accuracy in the limit r — )• 0. We expect the accuracy to be 
highly improved when the gas is far from the thermodynamical equilibrium as in the one 
dimensional case. 

In table[T]we report the CPU time T of these simulations, the CPU time per time cycle 
Tcycie, the CPU time per cycle per cell Tceii and the number of cycles needed to perform 
the computation for different meshes in space and a fixed mesh in velocity. As expected 
the number of time step linearly scales with the size of the spatial mesh at fixed velocity 
mesh (factor 2 when the cell number is multiplies by 4). The CPU time is very small 
compared to classical kinetic schemes, in less than 10 minutes the simulation of the Sod 
shock tube on a 200 mesh is computed. Finally we observe that the CPU time per cycle 
per cell is almost constant which allows to predict the end of the simulation and its cost 
beforehand. 
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Cell V # 


Cell X # iVc 


Cell XXV # Ntot 


Cycle 


Time 


Time/cycle 


Time/cell 


A^^ Bounds 


AT^ X TVj, 


N^x NyX K"^ 


-Acycle 


T(s) 


^cyclc (s) 


^ccU (s) 




25 X 25 


25 X 25 X 20^ 


13 


2s 


0.1538 


2.46 X 10"^ 




= 625 


= 250000 










To" 


50 X 50 


50 X 50 X 20^ 


25 


8s 


0.32 


1.28 X lO""* 




= 2500 


= 10^ 










100 X 100 


100 X 100 X 20^ 


50 


60s 


1.2 


1.20 X lO""* 




= 10000 


= 4 10^ 




Imn 








200 X 200 


200 X 200 X 20^ 


100 


490s 


535.75 


1.22 X lO-"* 




= 40000 


= 16 10^ 




~ 8mn 







Table 1: 2D Sod shock tube. The time per cycle is obtained by Tcyde = T/Ncyde and the 
time per cycle per cell by T^cii = T/iVcycie/^c- 



6.4 Numerical validation of the 3D/3D fast FKS method 

Here we report some simulations of the full 3D/3D problem. We consider only the case in 
which r = 0, which means, we project towards equilibrium at each time step, this is the 
fluid limit. We recall that, in this regime, the numerical method gives the worst results in 
terms of precision, on the other hand, exact solution are known and this permits to make 
fair comparisons. For all the other regimes, the performances of the method are better as 
shown in the previous section. 

The FKS method has been implemented in fortran on a sequential machine. The goal is 
to numerically show that such a kinetic scheme can reasonably perform on six dimensions 
on a mono-processor laptop. All simulations have been carried out on a HP EliteBook 
8740W Intel(R) Core(TM) i7 Q840@1.87GHz running under a Ubuntu (oneiric) version 
11.10. The code has been compiled with gfortran 4.6 compiler with -03 optimization flags. 

Otherwise noticed the velocity space is [—15, 15]^ or [—10, 10]^ and is discretized with 
Ny = 13 or N-u = 12 grid points in each velocity direction, leading to A'^^ = 2197 or 
1728 mesh points. The time step is fixed to 95% of the maximum time step allowed, as 



prescribed by the CFL condition (48), apart from the last time step which is chosen to 
exactly match the user-given final time. Symmetric boundary conditions are considered. 

The Sod shock tube in ID is run as a sanity checks in order to validate the imple- 
mentation of the method and show its ability to reproduce ID results with a 3D run. 
Then the Sod problem in 3D is simulated to show the performances of the FKS algorithm 
and further compared to a reference solution. For each simulation we report the memory 
consumption, the full CPU time and the CPU time cost per cell per time step. Some 
extrapolation of these results are also made to measure the efficiency of this method. 



6.4.1 ID Sod shock tube problem: A sanity check 

The ffist sanity check consists of running the ID Sod shock tube in x direction on x 2 x 2 
cubes. The initial data are the same as for the ID problem previously run. The final time 
is tfinai = 0.1. In our numerical experiments the computational domain is of size 1 in x 
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50x2x2 mesh — i — 
100x2x2 mesh 




200x2x2 mesh 

400x2x2 mesh 
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(a) 
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(b) 



50x2x2 
100x2x2 mesh — - 
200x2x2 mesh 
400x2x2 mesh 




50x2x2 mesh 
100x2x2 mesh 
200x2x2 mesh 
400x2x2 mesh 



0.4 0.6 




(d) 



Figure 7: 3D-1D Sod problem at ignai = 0.1 for 50, 100, 200, and 400 cells in x direction 
and 2 in y and z directions — Panels (a), (c), (d): Density, velocity and temperature as 
a function of x vs exact solution (straight line) — Panel (b) : 3D view of colored density 
for a 200 x 3 x 3 mesh. 



direction leading to Ax = l/N^. We set Ay = Az = Ax. Four successively refined meshes 
in X direction are utilized, = 50, 100, 200, and 400, in order to observe the convergence 
of the numerical method towards the exact solution. 

In Figure [7] we display the density, the velocity and the temperature vs the exact 
solution with solid line (respectively panels (a), (c) and (d)) and a 3D view on the mesh 
cells colored by density (panel (b) where a 200 x 3 x 3 mesh is used for figure scaling 
reasons). The first observation is the perfect symmetry in the ignorable directions y and z 
as all cells are plotted (notice that the results for a x 5 x 5 cells mesh exactly match the 
Nx X 2 X 2 results). The second obvious observation is the convergence of the numerical 
solution towards the exact solution when the mesh is refined. These results assess the 
ability of the method and the code to reproduce ID results without alteration. 

In table [2] we gather the number of cycles A'cycie) the CPU time T of these simulations 
and display the CPU time per time cycle Tcyde and the CPU time per cycle per cell 
Tceii. As expected, the cycle number and the CPU time per time cycle scales with the cell 
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Cell # 

N,^ xNyXN.x 
= N,x 


Cycle 

-^cyclc 


Time 

r(s) 


Time/cycle 

^cyclc (s) 

Transp. Coll. 


Time /cycle / cell 

rcell(s) 


Memory 

Mem(MB) 


50 X 2 X 2 
= 200 X 13^ = 439400 


81 


18 


0.22 

0.07% 99.93% 


1.11 X 10"^ 


0.660 


100 X 2 X 2 
= 400 X 13^ = 878800 


160 


68 


0.43 

0.05% 99.95% 


1.07 X 10-3 


0.704 


200 X 2 X 2 
= 800 X 13^ = 1757600 


318 


276 


0.87 

0.03% 99.97% 


1.08 X 10-3 


0.812 


400 X 2 X 2 
= 1600 X 13^ = 3515200 


634 


1071 


1.69 

0.02% 99.98% 


1.06 X 10-3 


1.000 



Table 2: ID Sod shock tube run with the 3D/3D FKS method. The time per cycle is 
obtained by Tcyde = T/Ncyde and the time per cycle per cell by Tceii = T/A^cycie/-^c- The 
relative percentage of the cost of the transport and relaxation stages are provided. For 
our FKS method the transport stage costs almost nothing. 



number and, consequently, the CPU time per cycle per cell is almost constant. This allows 
to almost exactly predict the duration of a simulation knowing the cell number. Moreover 
we have provided the relative percentage of the cost of the transport and collision stages. 

As expected the transport stage does not cost anything, in absolute value, especially 
when the number of cells increases. In fact, for computing the solution of this stage in 
all domain, we consider the evolution of the distribution function / in one single cell, the 
same happens in the other cells. This means that the cost of this stage is proportional to 
the mesh points in the velocity space. On the other hand, in finite volume methods 
as well as Monte Carlo method the cost to solve this stage is proportional to N^Nc with 
Nc = NxNyNz and, obviously this scales with Nc- Another satisfactory result is the 
memory storage Mem in MB (or Mo) of the method which is very low because we never 
have to store the distribution function values for more than points, leading to store 
133 X 7 reals, say ~ 0.123MB independently of Nc- Conversely the storage of the Monte 
Carlo method scales with the cell number Nc- Finally as expected the time T scales with 
a factor 4 for twice the number of cells. 

6.4.2 3D Sod shock tube problem 

The 3D Sod shock tube has been run with the 3D/3D FKS method. The left state of 
the ID Sod problem is set for any cell c with cell center radius Tc < 1/2, conversely the 
right state is set for cell radius Tc > 1/2. The final time is tfinai = 0.1. The domain is 
the unit cube and the mesh is composed of A^^ x Nx x Nx cells with Ax = 1/A^x and 
Ax = Ay = Az. The problem is run with Nx = 50 (125000 cells), A^^ = 100 (1 million 
cells) and Nx = 200 (8 millions cells). The velocity space is either [—10; 10] discretized 
with 123 points, or [—15; 15] discretized with 133 points. This leads to consider up to 
2003 X 133 ~ 17.7 milliards cehs. In Fi gure M the density is plotted as a function of 
the radius (left panel) and the colored density on a 3D view (right panel) for A^^, = 50 
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Cell # A^c X iV3 


Cycle 


Time 


Time/cycle 


Time/cell 


Mem 


Bounds 


xNyXN.x iV3 


-'^cycle 


T(s) 


^cyclc (s) 


TccU (s) 


(MB) 


To" 


253 X 133 


32 


346s 


10.81 


6.92 X 10""^ 


2.4 


1 — 1 


= 3.4328125 x 10^ 




(5.76mn) 








503 X 13a 


81 


4900s 


60.50 


4.84 X 10""^ 


15.5 




= 274.625000 x 10^ 




(1.36h) 










100 X 133 


160 


85720s 


535.75 


5.36 X lO^'' 


115.5 




= 2.1970 X 10^ 




(23.8h) 








extrapol. 


200 X 133 
= 1.7576 X 10^° 


320 


~ 1.4 X lO^s 
(16d) 


~ 4400 


5.5 X lO"'* 


~ 900 




253 X 123 


27 


218s 


8.07 


5.17 X 10^"^ 


2.3 


123 


= 27 X 10^ 




(3.63mn) 








503 X 


54 


2702s 


50.03 


4.00 X 10""^ 


15.4 




= 125 X 103 




(45mn) 










1003 X 123 


107 


38069s 


355.79 


3.56 X 10-"^ 


115.4 




= 1.728 X 10^ 




(10.57h) 








extrapol. 


2003 X 123 
= 1.3284 X 10^° 


214 


~ 633440s 
(7d) 


~ 2960 


3.7 X lO"'' 


~ 900 



Table 3: 3D Sod shock tube. The time per cycle is obtained by Tcyde = T/A'^cyde and the 
time per cycle per cell by Tceii = T /Ncyde/^c- The lines marked with extrapol. have been 
extrapolated by fixing Nc, N^ydc and Tceii- 



(middle panels) and Nx = 200 (bottom panels). The two different choices for the bounds 
and the mesh points in velocity space do not significantly change the results hence only 
the solution with bounds [—10; 10] and with 123 ^hq^y^ points is reported. The reference 
solution is obtained with a 2D axisymmetric compatible staggered Arbitrary-Lagrangian- 
Eulerian code [25j with 1000 cells in radial and 20 cells in angular directions. Moreover in 
Figure [s] (top panel) we present the convergence of the density as a function of cell center 
radius for all cehs for the 50 x 50 x 50, 100 x 100 x 100 and 200 x 200 x 200 cells meshes. 
These curves are compared to the reference solution in straight thick line and they show 
that the results are converging towards the reference solution. In table [3] we gather the 
number of time steps and the total CPU time T for 503 and 1003 cell meshes for the 
two different configurations: one with A'^y = 13 and the velocity space [—15,15] and the 
second one with A'^y = 12 and the velocity space [—10, 10]. For the 503 mesh the simulation 
takes 45 minutes or 1.36 hour depending on the configuration. For the finer 1003 mesh 
the simulation takes either 11 hours or 24 hours The memory consumption ranges from 
124Mb to 924Mb depending on the configurations and it scales with the number of cells 

A^e. 

Then, we compute the cost per cycle Tcyde and per cycle per cell Tceii- One observe 
that the cost per cycle per cell is an almost constant equal to 4 x 10~^s or 5.5 x 10~'*s. 
The extrapolation of the CPU time T for a 2003 mesh at Tceii fixed leads to one or two 
weeks computation for the two configurations and a memory storage of about 900MB. In 
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Converged solution 

200x300x200 mesh 






^ 1 ■ 



Density 




Density 



Figure 8: Sod problem at tfinai = 0.1 for iV^ x A^^ x cells (for = 50, 100, 200) for the 
velocity space [—10; 10] discretized with 12'^ mesh points. — Top: Convergence of density 
as a function of cell center radius for all cells vs converged solution (straight thick line) 
for the three meshes with zooms on contact and shock waves. Left: Density as a function 
of cell center radius (middle: Nx = 50, bottom: = 200) Right: 3D view of density 
on the unit cube = 50 (middle) and A^^ = 200 (bottom) (the mesh is only shown for 
Nx = 50). 
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Figure 9: Left: Log of the CPU time consumption for the 3D Sod problem at tgnai = 0.1 
as a function of (for N x N x N cell meshes) on a single processor laptop The red/blue 
squares are taken from Table [sj the thick red /blue curves are the extrapolation curve from 
Tccu- The horizontal lines corresponding to one hour, one day, week, month and year are 
also plotted. N = 100 corresponds to the 'one million cells' in space — Right: Log/Log 
scale. 



Figure|9]we plot the CPU time (red or blue symbols for each configuration and mesh points 
of the velocity space) and the extrapolation curves CPU{Nx, NcTceii) = -^f^NcTceii for 
the 3D Sod problem up to time ifinai = 0.1 for single processor laptop computation on a 
fixed mesh in velocity space of Ny = 12^ points. We deduced that the FKS method can 
be used at most on a single processor machine up to a 200 x 200 x 200 cells for roughly 
one week of computation. One also notices that the CPU time linearly scales on a log/log 
graph as expected (right panel of Figure [9]) 

7 Conclusions 

In this work we have presented a new super efficient numerical method for solving kinetic 
equations. The method is based on a splitting between the collision and the transport 
terms. The collision part is solved on a grid while the transport linear part is solved exactly 
by following the characteristics backward in time. The key point is that, conversely to 
semi-Lagrangian methods, we do not need to reconstruct the distribution function at each 
time step. In this first paper, we have presented the basic formulation of this new method 
for the BGK equation: Uniform meshes, piecewise constant discretization of the velocity 
space and a simple projection towards the equilibrium distribution have been considered. 

The numerical results show that the method is incredibly fast. We are now able to 
perform numerical simulations of the full six dimensional kinetic equation on a single pro- 
cessor machine in several hours. This important result opens the gate to extensive realistic 
numerical simulations of far from equilibrium physical models. Concerning the precision 
of the method, we observed, as expected, that the fast kinetic scheme (FKS) is more dis- 
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sipative close to the fluid regime and very precise for gases far from the thermodynamical 
equilibrium. 

In the future we would like to extend the method to non uniform meshes, more ad- 
vanced boundary conditions and different discretization of the velocity space. One expects 
with this last point to increase the accuracy of the schemes without losing its attractive 
efficiency. To avoid the loss of accuracy close to the fluid limit, we want to couple the 
FKS method to an high order solver for the system of equations which describes the fluid 
limit. Finally, we want to extend the method to other kinetic equations as the Boltzmann 
or the Vlasov equation. 
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